Seamless playback of composite media

ABSTRACT

A streaming composition system is described herein that provides easy workflow and playback capabilities for content producers to create composite media assets from existing and on-going media content and for streaming clients to seamlessly playback composite multimedia streams provided from different sources. These assets provide broadcasters an option to quickly turn around highlights for an on-going event. The streaming composition system allows a producer to identify clips within existing media assets and compose the clips into a new unified streaming presentation. For producers that already have smooth streaming media assets, the system leverages these assets to provide seamless playback across clip boundaries including advanced playback support for advertisement insertion, fast forward, rewind, and so on.

CROSS REFERENCE TO RELATED APPLICATIONS

This is a continuation application of application Ser. No. 14/179,817, filed Feb. 13, 2014, entitled “Seamless Playback of Composite Media”, and U.S. Pat. No. 8,683,337, filed Jun. 9, 2010, entitled “Seamless Playback of Composite Media”, the entirety of which is incorporated herein by reference.

BACKGROUND

Streaming media is multimedia that is constantly received by, and normally presented to, an end-user (using a client) while it is being delivered by a streaming provider (using a server). Several protocols exist for streaming media, including the smooth streaming protocol introduced by MICROSOFT™ Internet Information Server (IIS). Prior to smooth streaming, most streaming media technologies used tight coupling between server and client with a stateful connection. The stateful connection between client and server created additional server overhead (because the server tracked a current state of each client) and limited the scalability of the server.

MICROSOFT™ IIS Smooth Streaming (part of IIS Media Services, referred to herein as smooth streaming) provides stateless communication between the client and server by breaking media up into chunks that are individually addressable and can be individually requested by clients. For a particular media event or content item, the smooth streaming server provides a manifest file that describes each of the chunks that comprise the event. For example, a one-minute video provided by smooth streaming may include 60 one-second audiovisual chunks. Each chunk contains metadata and media content. The metadata may describe useful information about the media content, such as the bit rate of the media content, where the media content fits into a larger media element, a codec used to encode the media content, and so forth. The client uses this information to place the chunk into a storyboard of the larger media element and to properly decode and playback the media content. The chunks can be in any format, such as Motion Picture Experts Group (MPEG) 4 boxes or other containers. A smooth streaming client plays a media event to a user by reading the manifest and regularly requesting chunks from the server. The user may also skip around (e.g., seek, fast forward, rewind) and the client can provide these behaviors by requesting later or earlier chunks described by the manifest. For live events, the server may provide the manifest to the client piecemeal, so that the server informs the client of newly available chunks as they become available.

Each chunk may have its own Uniform Resource Locator (URL), allowing chunks to be cacheable by existing Internet infrastructure. The Internet contains many types of downloadable media content items, including audio, video, documents, and so forth. These content items are often very large, such as video in the hundreds of megabytes. Users often retrieve documents over the Internet using Hypertext Transfer Protocol (HTTP) through a web browser. The Internet has built up a large infrastructure of routers and proxies that are effective at caching data for HTTP. Servers can provide cached data to clients with less delay and by using fewer resources than re-requesting the content from the original source. For example, a user in New York may download a content item served from a host in Japan, and receive the content item through a router in California. If a user in New Jersey requests the same file, the router in California may be able to provide the content item without again requesting the data from the host in Japan. This reduces the network traffic over possibly strained routes, and allows the user in New Jersey to receive the content item with less latency.

While smooth streaming provides a great experience for viewing streaming media over the Internet and other networks, users often want to view (and producers of content often want to provide) content that comes from different sources or from different existing content items. For example, a sports network may want to provide a highlight video at the end of each day that includes some new commentary and some selections from earlier media events. Today the sports network can provide links to each video, but users may not want to view dozens of different video streams or files. Producers of content do not want to re-encode or repackage each earlier content item for rerelease as a new content item for these types of purposes. Only by repackaging the content can the publisher provide the user with familiar smooth streaming controls, such as skipping forward and backward in a stream. In many cases, the producer may want to provide quick turnaround to create highlights using a combination of on-demand and live assets immediately after an event or even as an event is still on-going (e.g., for late joining viewers of the event).

SUMMARY

A streaming composition system is described herein that provides easy workflow and playback capabilities for content producers to create composite media assets from existing and on-going media content and for streaming clients to seamlessly playback composite multimedia streams provided from different sources. These assets provide broadcasters an option to quickly make highlights for an on-going event available. The streaming composition system allows a producer to identify clips within existing media assets and compose the clips into a new unified streaming presentation. For producers that already have smooth streaming media assets, the system reuses these assets to provide seamless playback across clip boundaries including advanced playback support for advertisement insertion, fast forward, rewind, and so on. The system allows having mark-in/mark-out points on one or more video streams to create a composite stream, so that the composite stream can include any portion within the original media assets as well as inserting clips from new media assets. Thus, the streaming composition system provides a straightforward way for content producers to produce a composite presentation from existing media assets and for clients to seamlessly view the composite presentation as if it were a single media presentation.

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram that illustrates components of the streaming composition system, in one embodiment.

FIG. 2 is a flow diagram that illustrates processing of the streaming composition system to publish a composite media presentation, in one embodiment.

FIG. 3 is a flow diagram that illustrates processing of the streaming composition system to play a composite media presentation, in one embodiment.

FIG. 4 is a block diagram that illustrates a virtual timeline generated by the streaming composition system, in one embodiment.

DETAILED DESCRIPTION

A streaming composition system is described herein that provides easy workflow and playback capabilities for content producers to create composite media assets from existing and on-going media content and for streaming clients to seamlessly playback composite multimedia streams provided from different sources. These assets provide broadcasters an option to quickly turn around highlights for an on-going event (e.g., to attract more customers or provide a quick update to the late joiners). The streaming composition system allows a producer to identify clips within existing media assets and compose the clips into a new unified streaming presentation. For producers that already have smooth streaming media assets, the system leverages these assets to provide seamless playback across clip boundaries including advanced playback support for advertisement insertion, fast forward, rewind, and so on. The system allows having mark-in/mark-out points on one or more video streams to create a composite stream, so that the composite stream can include any portion within the original media assets as well as inserting clips from new media assets (e.g., to provide a commentary or advertisements along with the original media assets).

The system also provides a client component that implements a familiar client application-programming interface (API) that can play and manage composite media streams in a similar manner to prior (non-composite) smooth streaming playback. For example, MICROSOFT™ provides a SILVERLIGHT™ based control (the Smooth Streaming Media Element (SSME) class, part of the IIS Smooth Streaming Client) for playing smooth streaming presentations, which the streaming composition system can modify to provide for playback of composite streams. The composite streams appear as a single stream to the end user. Thus, the streaming composition system provides a straightforward way for content producers to produce composite presentations from existing media assets and for clients to seamlessly view a composite presentation as if it were a single media presentation.

In some embodiments, the streaming composition system defines a composite manifest structure that includes portions of multiple manifests from existing smooth streaming assets. The composite manifest structure adds clips to the smooth streaming manifest, where each clip defines an entry and exit boundary (e.g., a start and end time) into a separate media asset. The composite manifest defines the order and content of each clip. The client component uses the composite manifest to determine which chunks of media data to request from the server. In some embodiments, the client component builds a virtual timeline that includes a total duration determined by adding up the length of all of the clips that make up the composite stream. The client component may display the virtual timeline to the user so that the user can skip within the video and see markers that identify portions of the video (e.g., each touchdown in a football game video clip). In some embodiments, the streaming composition system provides a tool through which a content producer can specify existing assets and create a composite manifest for delivery to clients. The producer uploads the created composite manifest to a web server to which clients refer to find content. The client sets the URL of the web server as a source for playing streaming media, and the client requests media data from the server (or other servers) according to the instructions in the manifest.

FIG. 1 is a block diagram that illustrates components of the streaming composition system, in one embodiment. The system 100 includes a user interface component 110, a clip identification component 120, a composite manifest component 130, a manifest upload component 140, a client component 150, a source identification component 160, a manifest parsing component 170, a virtual timeline component 180, and a playback component 190. Each of these components is described in further detail herein.

The user interface component 110 provides an interface through which a content producer can select clips for inclusion in a composite media stream. For example, the system may display a graphical user interface (GUI), console user interface (CUI), web page, or other facility for collecting information from the producer identifying media assets and portions of the assets to be incorporated into a composite streaming presentation. In some cases, the producer may use the user interface while an event is still in progress to create clips indicating highlights or other relevant portions of a presentation related to the event. The client may receive an updated manifest based on the producer's addition of new clips.

The clip identification component 120 receives start and stop information about each of multiple clips to be included in the composite media stream. For example, the system may display a list of media assets and a media-playing interface through which the content producer can view the media assets, select start and stop times to create a clip, and add the clip to the composite media stream.

The composite manifest component 130 creates a composite manifest that describes the composite media stream, including an identification of each clip that comprises the composite media stream. For example, the component 130 may create an extensible markup language (XML) file or other specification that describes the clips that comprise the composite media stream. Each clip's portion of the manifest may resemble a manifest of a single media presentation, and the composite manifest specifies and orders the clips for seamless playback together, so that the multiple clips appear as one presentation to a viewing user.
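As an illustration only, the creation of such an XML composite manifest might be sketched as follows. The element and attribute names (Clip, Url, ClipBegin, ClipEnd) follow the example composite manifest given later in this description; the function name build_composite_manifest and the second manifest URL are hypothetical, and the sketch omits the StreamIndex content that each clip would normally carry.

```python
import xml.etree.ElementTree as ET


def build_composite_manifest(clips):
    """Return a minimal composite manifest as an XML string.

    clips is an ordered list of (manifest_url, clip_begin, clip_end) tuples,
    with times expressed in the manifest's own time units.
    """
    # The overall duration is assumed to be the sum of the clip lengths.
    total = sum(end - begin for _, begin, end in clips)
    root = ET.Element(
        "SmoothStreamingMedia",
        MajorVersion="2",
        MinorVersion="0",
        Duration=str(total),
    )
    # Clips are written in presentation order.
    for url, begin, end in clips:
        ET.SubElement(
            root, "Clip", Url=url, ClipBegin=str(begin), ClipEnd=str(end)
        )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_composite_manifest([
        ("http://abcxyz.com/sample.ism/Manifest", 0, 100000000),
        ("http://abcxyz.com/other.ism/Manifest", 50000000, 150000000),
    ]))
```

A real tool would also copy the relevant StreamIndex, QualityLevel, and chunk entries of each source manifest into the corresponding Clip element.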

The manifest upload component 140 uploads the created composite manifest to a server from which clients can retrieve the manifest. For example, the user interface component 110 may include a publish command that allows the content producer to expose the created composite manifest for viewer consumption. Uploading or publishing may include placing the manifest in a directory or file structure of one or more origin servers. As clients retrieve the composite manifest, the manifest may be present in one or more caches so that additional clients can retrieve the manifest without contacting the original server.

The client component 150 provides a client-side API for playing back composite media streams. The component 150 allows individual websites or applications to customize behavior of the system based on their own purposes, such as to display branding and other content in association with a streaming media presentation. The client component 150 includes functions for setting up playback of a composite media stream, including identifying a source of the stream from which to download the composite manifest, and other options such as a default bit rate. The client component 150 may also include functions for managing a playing stream, such as controls for skipping, rewinding, fast-forwarding, pausing, advancing to bookmarked sections of the stream, and so forth as well as stream/track selection functions to choose other streams related to the current presentation (e.g., streams with captions, multiple audio languages, and so on). The client component 150 allows applications to leverage the system on top of a platform (SILVERLIGHT™) that has no native notion or support for seamless clip-stitching or trick play (e.g., fast-forward and rewind). The client component 150 handles the identification and retrieval of media content into a form that the underlying software platform can handle the same as if it were playing local media content.

The client component 150 may include other subcomponents (not shown), such as a heuristics component, a state management component, and a configuration component. The heuristics component analyzes the success of receiving packets from the server and adapts the client's requests based on a set of current network and other conditions. For example, if the client is routinely receiving media chunks late, then the component determines that the bandwidth between the client and the server is inadequate for the current bit rate, and the client may begin requesting media chunks at a lower bit rate. The state management component manages state information about ongoing playback and provides client applications with consistent state indications. The component allows state management within the SSME class and provides developers an easy way to track the state of the playback without having to understand potentially changing conditions. The configuration component provides an interface through which developers can configure the system 100.
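By way of a hedged illustration, a heuristic of the kind described for the heuristics component might step down one bit rate level when most recently requested chunks arrived late. The function name next_bitrate, the 50% threshold, and the notion of a fixed observation window are assumptions made for this sketch and are not taken from the described system.

```python
def next_bitrate(available, current, late_chunks, window, threshold=0.5):
    """Pick the bit rate for the next chunk request.

    available   -- bit rates offered for the stream (any order)
    current     -- bit rate currently being requested
    late_chunks -- number of chunks in the window that arrived late
    window      -- number of recent chunks observed
    threshold   -- assumed fraction of late chunks that triggers a step down
    """
    levels = sorted(available)
    index = levels.index(current)
    # Step down one level if too many recent chunks were late
    # and a lower level exists; otherwise keep the current level.
    if window > 0 and late_chunks / window > threshold and index > 0:
        return levels[index - 1]
    return current


if __name__ == "__main__":
    rates = [230000, 305000, 403000]
    print(next_bitrate(rates, 403000, late_chunks=4, window=5))
```

A production heuristic would typically also step back up when conditions improve and would weigh buffer levels, but those behaviors are outside this sketch.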

The source identification component 160 identifies a source of the composite media stream, wherein the source provides the composite manifest. For example, a user may select or a web page may incorporate a link to a particular origin server (or servers) that provides the composite manifest. In some cases, the same server may serve the media content, but the content producer can also specify playback of content from other servers within the composite manifest.

The manifest parsing component 170 retrieves the composite manifest, identifies clips specified within the composite manifest, and prepares the clips for playback. The manifest parsing component may navigate an XML or other document that comprises the composite manifest and identify the clips specified within the composite manifest so that other components can prepare for playback. For example, the component 170 may identify the first few chunks to be retrieved to playback the composite stream and begin retrieving the chunks.

The virtual timeline component 180 builds a virtual timeline that spans one or more clips specified by the composite manifest. Because the composite stream is built of multiple clips of varying lengths, the timeline of any particular clip does not reflect the timeline for the composite stream. Users often expect to view the length of playing media content, and to be able to select locations on a playback bar where the locations map to times within the presentation. For example, a user viewing a one-minute presentation may click near a 45-second mark to quickly view the end of the presentation. The virtual timeline component 180 determines a virtual timeline based on the information about each clip in the composite manifest. For example, the component 180 may add up the duration of each clip and include any additional time (e.g., for inserted advertisements that may not be represented as clips) to determine an overall duration of the composite presentation and display a playback bar. For live events or composite presentations that include live clips, the system may provide a default amount of time on a playback bar (e.g., one hour) and periodically update the bar as the presentation continues.
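The duration calculation described above can be sketched as follows, under stated assumptions. The Clip class, the function virtual_duration, and the representation of a live clip by a missing end time are hypothetical; the constant of 10,000,000 time units per second is inferred from the example composite manifest in this description, in which a 10-second clip ends at 100000000.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed time scale: 10,000,000 manifest time units per second.
TICKS_PER_SECOND = 10_000_000


@dataclass
class Clip:
    begin: int
    end: Optional[int]  # None models a live clip whose end is not yet known


def virtual_duration(clips, extra_ticks=0,
                     live_default=3600 * TICKS_PER_SECOND):
    """Sum clip lengths, plus extra time (e.g., inserted advertisements).

    A clip with no known end contributes a default amount (one hour here).
    """
    total = extra_ticks
    for clip in clips:
        if clip.end is None:
            total += live_default
        else:
            total += clip.end - clip.begin
    return total


if __name__ == "__main__":
    clips = [Clip(0, 100_000_000), Clip(50_000_000, 150_000_000)]
    print(virtual_duration(clips) / TICKS_PER_SECOND, "seconds")
```

For a presentation with a live clip, a client would call the function again as the manifest is updated so that the playback bar reflects the growing duration.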

The playback component 190 plays the composite media stream using client hardware. For example, the component 190 may invoke client APIs, such as MICROSOFT™ SILVERLIGHT™ or MICROSOFT™ DirectX APIs for playing back multimedia content using one or more client codecs appropriate for the encoding of each of the clips within the composite media stream. The playback component 190 responds to user controls and invokes the client component 150 and other components to perform any further processing, such as skipping to a new clip based on a user's selections along the virtual timeline and retrieving chunks for the new clip from one or more servers. In some embodiments, the source identification component 160, manifest parsing component 170, virtual timeline component 180, and playback component 190 may be part of or associated with the client component 150 and perform operations on the client to consume composite manifests retrieved from a server.

The computing device on which the streaming composition system is implemented may include a central processing unit, memory, input devices (e.g., keyboard and pointing devices), output devices (e.g., display devices), and storage devices (e.g., disk drives or other non-volatile storage media). The memory and storage devices are computer-readable storage media that may be encoded with computer-executable instructions (e.g., software) that implement or enable the system. In addition, the data structures and message structures may be stored or transmitted via a data transmission medium, such as a signal on a communication link. Various communication links may be used, such as the Internet, a local area network, a wide area network, a point-to-point dial-up connection, a cell phone network, and so on.

Embodiments of the system may be implemented in various operating environments that include personal computers, server computers, handheld or laptop devices, multiprocessor systems, microprocessor-based systems, programmable consumer electronics, digital cameras, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, set top boxes, systems on a chip (SOCs), and so on. The computer systems may be cell phones, personal digital assistants, smart phones, personal computers, programmable consumer electronics, digital cameras, and so on.

The system may be described in the general context of computer-executable instructions, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, and so on that perform particular tasks or implement particular abstract data types. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.

FIG. 2 is a flow diagram that illustrates processing of the streaming composition system to publish a composite media presentation, in one embodiment. Beginning in block 210, the system provides an interface through which a content producer can identify media assets from which to compose clips into a seamless composite media presentation. For example, the system may display a GUI from which the producer can select one or more on-demand or live presentations from which to specify clips. Continuing in block 220, the system receives information identifying one or more clips for the composite media presentation. For example, the producer may successively select one or more previous presentations and identify start and stop times of one or more locations within the previous presentations from which to create clips. For example, the clips may represent highlights of a sporting event.

Continuing in block 230, the system identifies the one or more clips identified by the received information and stores information for playing each identified clip. For example, the system may identify one or more chunks that correspond to the locations in the previous presentation specified by the clip. The system may also identify other information relevant to the clip, such as available and default bit rates, encodings available, supplemental information such as captions, and so forth. Continuing in block 240, the system creates a composite manifest that specifies information about the composite media presentation and each clip associated with the composite media presentation. The manifest allows the client to play clip after clip as a single presentation to the user without asking the user to identify each subsequent item to play. The composite manifest can be an XML file, rows in a database, or any other suitable facility for identifying the clips and an order in which they are to be presented to the client.
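One possible way to identify the chunks that correspond to a clip (block 230) is to keep every chunk whose time range overlaps the clip's start and stop times. The following is a sketch only; the function name chunks_for_clip is hypothetical, and the approach assumes each chunk ends where the next begins, with the final chunk's length supplied separately.

```python
def chunks_for_clip(chunk_starts, last_duration, clip_begin, clip_end):
    """Return the start times of chunks overlapping [clip_begin, clip_end).

    chunk_starts  -- ascending chunk start times from the source manifest
    last_duration -- duration of the final chunk in chunk_starts
    """
    if not chunk_starts:
        return []
    # Each chunk is assumed to end where the next one starts.
    chunk_ends = chunk_starts[1:] + [chunk_starts[-1] + last_duration]
    return [
        start
        for start, end in zip(chunk_starts, chunk_ends)
        if start < clip_end and end > clip_begin
    ]


if __name__ == "__main__":
    starts = [0, 22350000, 42370000, 62390000, 82410000,
              102430000, 122450000]
    # A clip of the first 10 seconds (100000000 time units).
    print(chunks_for_clip(starts, 20020000, 0, 100000000))
```

With the chunk start times used here, a clip covering the first 100000000 time units selects five chunks, which is consistent with the five video chunks listed for the clip in the example composite manifest in this description.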

Continuing in block 250, the system uploads the created composite manifest to one or more servers for access by clients. For example, the system may copy the composite manifest (e.g., a CSM file) to a web server from which clients can retrieve the manifest at a well-defined URL. The system may also publish the URL, such as on a web page that provides a directory of media assets available from a site (e.g., the front page or sports page of a news site). Note that in many cases, composing the composite manifest may be the only action for the producer to publish a new composite media presentation. The presentation may leverage existing media assets already published and available from one or more servers, and the system provides an easy way for the producer to create new presentations from those assets. After block 250, these steps conclude.

FIG. 3 is a flow diagram that illustrates processing of the streaming composition system to play a composite media presentation, in one embodiment. Beginning in block 310, the system receives an identification of a source of the composite media presentation. For example, an application invoked by the user through a web browser may invoke a client API and specify a URL that identifies the composite media presentation. The URL may refer to a location hosted by one or more origin servers and cached by specialized and general-purpose cache servers on a network.

Continuing in block 320, the system retrieves a composite manifest that describes the composite media presentation from the identified source. For example, if the source is identified using a URL, then the system may retrieve the manifest by issuing an HTTP GET request specifying a domain, virtual directory, and file name of the manifest. The system receives the manifest in response, such as via an HTTP 200 OK response with associated data. Continuing in block 330, the system parses the retrieved manifest to identify one or more clips associated with the composite manifest. A composite manifest identifies one or more clips, such as via a hierarchical scheme like that provided by XML. An XML composite manifest file may include a tag for each clip and sub-tags that identify information about the clips, such as smooth streaming media chunks that comprise the clips and a location for retrieving the chunks.

Continuing in block 340, the system generates a virtual timeline that spans multiple clips associated with the composite manifest. For example, the system may determine an overall duration of the composite media presentation based on a duration associated with each clip. In cases where a clip's duration is not known, such as for an ongoing live presentation, the system may contribute an estimate or default amount to the virtual timeline. The virtual timeline allows a viewer of the presentation to seek freely within the composite media presentation seamlessly without knowledge of clip boundaries. Continuing in block 350, the system selects a first identified clip. The system retrieves the clip information from the composite manifest and identifies one or more chunks associated with the clip. The chunk information may contain a URL for retrieving chunks, bitrate information, and other media information. On subsequent iterations, the system selects the next identified clip.

Continuing in block 360, the system retrieves one or more clip chunks associated with the selected clip. For example, the system may issue an HTTP GET request to retrieve a chunk associated with a particular URL specified in the composite manifest. These requests are the same ones made for the original source, allowing the client to potentially receive cached assets from earlier requests, thereby increasing efficiency. The system may receive the chunk from one or more origin servers or from a cache server (e.g., if the clip chunk has been previously requested by other clients). Continuing in block 370, the system plays one or more retrieved clip chunks in order. Each chunk may include an MPEG Group of Pictures (GOP), a frame, or other suitable subdivision of a media presentation for transmission in parts from a server to a client. Although shown serially, the system may continuously retrieve clip chunks and play them back as they become available. These processes may occur in parallel, such as on different threads, and the system may begin retrieving chunks associated with subsequent clips before the selected clip is done playing, such as based on extra available bandwidth. The steps are shown serially herein for ease of illustration, but those of ordinary skill in the art will recognize various modifications and optimizations that can achieve similar results and exhibit other positive qualities.
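As a sketch of how a client might form the URL for such a chunk request, the following fills in the Url template that appears in the example manifests of this description. The function name chunk_url is hypothetical, and the sketch assumes that the template is resolved relative to the parent path of the manifest URL; a given server deployment may resolve it differently.

```python
def chunk_url(manifest_url, template, bitrate, start_time):
    """Build the request URL for one chunk.

    manifest_url -- URL of the source manifest named by the clip
    template     -- Url attribute of the StreamIndex element
    bitrate      -- Bitrate of the chosen QualityLevel
    start_time   -- 't' value of the chunk to request
    """
    # Assumption: chunk paths are relative to the manifest's parent path.
    base = manifest_url.rsplit("/", 1)[0]
    path = (template
            .replace("{bitrate}", str(bitrate))
            .replace("{start time}", str(start_time)))
    return f"{base}/{path}"


if __name__ == "__main__":
    print(chunk_url(
        "http://abcxyz.com/sample.ism/Manifest",
        "QualityLevels({bitrate})/Fragments(video={start time})",
        1644000,
        22350000,
    ))
```

Because the resulting URL depends only on the source asset, bit rate, and chunk start time, a request made for a composite presentation is identical to one made for the original presentation and can be answered from the same caches.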

Continuing in decision block 380, if the system determines that more clips are available in the composite manifest, then the system loops to block 350 to select the next clip, else the system completes. The system continues playing each clip specified in the composite manifest until the end of the composite media presentation. After block 380, these steps conclude.
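The serial form of blocks 350 through 380 can be sketched as a simple nested loop. This is an illustration under assumptions: the function play_presentation, the dictionary key chunk_urls, and the fetch and play callables are all hypothetical stand-ins for the HTTP retrieval and decoding facilities of a real client, and no parallel prefetching is shown.

```python
def play_presentation(clips, fetch, play):
    """Play every clip of a composite presentation in manifest order.

    clips -- ordered list of dicts, each with a "chunk_urls" list
    fetch -- callable that retrieves the chunk data for a URL
    play  -- callable that renders one retrieved chunk
    """
    for clip in clips:                  # blocks 350 and 380: next clip, if any
        for url in clip["chunk_urls"]:
            data = fetch(url)           # block 360: retrieve a clip chunk
            play(data)                  # block 370: play the chunk in order


if __name__ == "__main__":
    demo = [{"chunk_urls": ["a", "b"]}, {"chunk_urls": ["c"]}]
    # Stand-in fetch and play functions for demonstration only.
    play_presentation(demo, fetch=lambda url: url.upper(), play=print)
```

Passing the retrieval and playback functions as parameters keeps the loop independent of any particular network or media API.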

FIG. 4 is a block diagram that illustrates a virtual timeline generated by the streaming composition system, in one embodiment. The diagram includes a composite media stream 410 made up of four clips, such as a first clip 420. Each clip identifies clip-based start and end times 430. For example, the first clip starts at 31 units (e.g., seconds) and ends at 41 units. The system maps all of the clips onto a virtual timeline 440 that includes times running from zero until the end of the composite media presentation. The client software can paint the virtual timeline as a playback bar or other control from which the user can seek or select other controls to modify playback of the presentation.
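The mapping from a position on the virtual timeline back to a clip and a clip-based time can be sketched as follows. Only the first clip's times (31 to 41 units) come from the description of FIG. 4; the other three clips in the usage example, and the function name locate, are hypothetical.

```python
def locate(clips, virtual_time):
    """Map a virtual-timeline position to (clip_index, clip_based_time).

    clips -- ordered list of (start, end) clip-based times
    """
    if virtual_time < 0:
        raise ValueError("virtual time must not be negative")
    offset = 0  # virtual time at which the current clip begins
    for index, (start, end) in enumerate(clips):
        length = end - start
        if virtual_time < offset + length:
            return index, start + (virtual_time - offset)
        offset += length
    raise ValueError("virtual time is past the end of the presentation")


if __name__ == "__main__":
    # First clip per FIG. 4; the remaining clips are made-up values.
    clips = [(31, 41), (5, 20), (100, 130), (0, 8)]
    print(locate(clips, 12))
```

In this example the first clip occupies virtual times 0 through 10, so a seek to virtual time 12 lands two units into the second clip.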

Following is an example of a manifest format specification that illustrates one embodiment of the streaming composition system and the features described herein. It is helpful to start with an example of a current manifest file, such as that provided by MICROSOFT™ Internet Information Server Smooth Streaming.

<?xml version="1.0" encoding="utf-16" ?>
<!-- Created with Expression Encoder version 3.0.1332.0-->
<SmoothStreamingMedia MajorVersion="2" MinorVersion="0" Duration="300000000">
  <StreamIndex Type="video" Chunks="15" QualityLevels="8" MaxWidth="640" MaxHeight="480" DisplayWidth="640" DisplayHeight="480" Url="QualityLevels({bitrate})/Fragments(video={start time})">
    <QualityLevel Index="0" Bitrate="1644000" FourCC="WVC1" MaxWidth="640" MaxHeight="480" CodecPrivateData="250000010FCBF213F0EF8A13F83BE80C9081B22B6457400000010E5A67F840" />
    <QualityLevel Index="1" Bitrate="1241000" FourCC="WVC1" MaxWidth="640" MaxHeight="480" CodecPrivateData="250000010FCBE613F0EF8A13F83BE80C9081A5DECBBE400000010E5A67F840" />
    <QualityLevel Index="2" Bitrate="937000" FourCC="WVC1" MaxWidth="640" MaxHeight="480" CodecPrivateData="250000010FCBDC13F0EF8A13F83BE80C90811C97F260C00000010E5A67F840" />
    <QualityLevel Index="3" Bitrate="708000" FourCC="WVC1" MaxWidth="428" MaxHeight="320" CodecPrivateData="250000010FCBD60D509F8A0D5827E80C9081159AD66CC00000010E5A67F840" />
    <QualityLevel Index="4" Bitrate="534000" FourCC="WVC1" MaxWidth="428" MaxHeight="320" CodecPrivateData="250000010FCBD00D509F8A0D5827E80C9081104B412F400000010E5A67F840" />
    <QualityLevel Index="5" Bitrate="403000" FourCC="WVC1" MaxWidth="428" MaxHeight="320" CodecPrivateData="250000010FCBCC0D509F8A0D5827E80C90808C4BE263400000010E5A67F840" />
    <QualityLevel Index="6" Bitrate="305000" FourCC="WVC1" MaxWidth="364" MaxHeight="272" CodecPrivateData="250000010FC3C80B50878A0B5821E80C9080894E4A76400000010E5A67F840" />
    <QualityLevel Index="7" Bitrate="230000" FourCC="WVC1" MaxWidth="364" MaxHeight="272" CodecPrivateData="250000010FC3C60B50878A0B5821E80C90800704704DC00000010E5A67F840" />
    <c t="0" />
    <c t="22350000" />
    <c t="42370000" />
    <c t="62390000" />
    <c t="82410000" />
    <c t="102430000" />
    <c t="122450000" />
    <c t="142470000" />
    <c t="162490000" />
    <c t="182510000" />
    <c t="202530000" />
    <c t="222550000" />
    <c t="242570000" />
    <c t="262590000" />
    <c t="282610000" d="17350001" />
  </StreamIndex>
  <StreamIndex Type="audio" Index="0" FourCC="WMAP" Chunks="15" QualityLevels="1" Url="QualityLevels({bitrate})/Fragments(audio={start time})">
    <QualityLevel Bitrate="192000" SamplingRate="44100" Channels="2" BitsPerSample="16" PacketSize="8917" AudioTag="354" CodecPrivateData="1000030000000000000000000000E0000000" />
    <c t="0" />
    <c t="22291156" />
    <c t="40867120" />
    <c t="60371882" />
    <c t="84056235" />
    <c t="100774603" />
    <c t="121208163" />
    <c t="143034920" />
    <c t="160682086" />
    <c t="181580045" />
    <c t="202013605" />
    <c t="221518367" />
    <c t="242880725" />
    <c t="260789115" />
    <c t="282354648" d="17993650" />
  </StreamIndex>
</SmoothStreamingMedia>

Each SmoothStreamingMedia element contains one or more StreamIndex elements that specify available streams for the presentation. For example, one stream may represent video data and the other stream may represent audio data. Each StreamIndex element contains one or more QualityLevel elements and ‘c’ elements. The QualityLevel elements describe one or more available bitrates or other quality types available for the StreamIndex parent. The ‘c’ elements describe the available chunks of the media presentation, and include a time specification that designates an offset within the overall presentation where the chunk begins. The final chunk may also include a duration (the ‘d’ attribute in the example above).
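
As a minimal sketch of how a client might read this structure, the following Python fragment parses an abbreviated manifest and collects, for each StreamIndex, the available bitrates and the chunk start times. The function name summarize_streams and the shortened manifest (one quality level, three chunks, and an illustrative final duration) are assumptions made for illustration and are not part of the system described herein.

```python
import xml.etree.ElementTree as ET

# Abbreviated, illustrative manifest modeled on the example above.
# The final duration (d) is a made-up value chosen to fill the 6-second total.
MANIFEST = """<SmoothStreamingMedia MajorVersion="2" MinorVersion="0" Duration="60000000">
  <StreamIndex Type="video" Chunks="3" QualityLevels="1"
               Url="QualityLevels({bitrate})/Fragments(video={start time})">
    <QualityLevel Index="0" Bitrate="1644000" FourCC="WVC1" />
    <c t="0" />
    <c t="22350000" />
    <c t="42370000" d="17630000" />
  </StreamIndex>
</SmoothStreamingMedia>"""


def summarize_streams(manifest_xml):
    """Return one dict per StreamIndex with its bitrates and chunk start times."""
    root = ET.fromstring(manifest_xml)
    streams = []
    for stream in root.findall("StreamIndex"):
        chunks = stream.findall("c")
        # Only the final 'c' element is expected to carry a duration here.
        last = chunks[-1].get("d") if chunks else None
        streams.append({
            "type": stream.get("Type"),
            "bitrates": [int(q.get("Bitrate")) for q in stream.findall("QualityLevel")],
            "chunk_starts": [int(c.get("t")) for c in chunks],
            "last_duration": int(last) if last is not None else None,
        })
    return streams


if __name__ == "__main__":
    for info in summarize_streams(MANIFEST):
        print(info)
```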

Applying the streaming composition system to the above example manifest, suppose that a content producer wants to include a clip of the first 10 seconds of the above presentation in a new composite media presentation. This can be represented in a composite manifest as follows (http://abcxyz.com/sample.ism/Manifest is the URL of the manifest of the original presentation).

<?xml version="1.0" encoding="utf-16" ?>
<SmoothStreamingMedia MajorVersion="2" MinorVersion="0" Duration="200000000">
  <Clip Url="http://abcxyz.com/sample.ism/Manifest" ClipBegin="0" ClipEnd="100000000">
    <StreamIndex Type="video" Chunks="5" QualityLevels="8"
        MaxWidth="640" MaxHeight="480" DisplayWidth="640" DisplayHeight="480"
        Url="QualityLevels({bitrate})/Fragments(video={start time})">
      <QualityLevel Index="0" Bitrate="1644000" FourCC="WVC1" MaxWidth="640" MaxHeight="480" CodecPrivateData="250000010FCBF213F0EF8A13F83BE80C9081B22B6457400000010E5A67F840" />
      <QualityLevel Index="1" Bitrate="1241000" FourCC="WVC1" MaxWidth="640" MaxHeight="480" CodecPrivateData="250000010FCBE613F0EF8A13F83BE80C9081A5DECBBE400000010E5A67F840" />
      <QualityLevel Index="2" Bitrate="937000" FourCC="WVC1" MaxWidth="640" MaxHeight="480" CodecPrivateData="250000010FCBDC13F0EF8A13F83BE80C90811C97F260C00000010E5A67F840" />
      <QualityLevel Index="3" Bitrate="708000" FourCC="WVC1" MaxWidth="428" MaxHeight="320" CodecPrivateData="250000010FCBD60D509F8A0D5827E80C9081159AD66CC00000010E5A67F840" />
      <QualityLevel Index="4" Bitrate="534000" FourCC="WVC1" MaxWidth="428" MaxHeight="320" CodecPrivateData="250000010FCBD00D509F8A0D5827E80C9081104B412F400000010E5A67F840" />
      <QualityLevel Index="5" Bitrate="403000" FourCC="WVC1" MaxWidth="428" MaxHeight="320" CodecPrivateData="250000010FCBCC0D509F8A0D5827E80C90808C4BE263400000010E5A67F840" />
      <QualityLevel Index="6" Bitrate="305000" FourCC="WVC1" MaxWidth="364" MaxHeight="272" CodecPrivateData="250000010FC3C80B50878A0B5821E80C9080894E4A76400000010E5A67F840" />
      <QualityLevel Index="7" Bitrate="230000" FourCC="WVC1" MaxWidth="364" MaxHeight="272" CodecPrivateData="250000010FC3C60B50878A0B5821E80C90800704704DC00000010E5A67F840" />
      <c t="0" />
      <c t="22350000" />
      <c t="42370000" />
      <c t="62390000" />
      <c t="82410000" d="20020000" />
    </StreamIndex>
    <StreamIndex Type="audio" Index="0" FourCC="WMAP" Chunks="5" QualityLevels="1"
        Url="QualityLevels({bitrate})/Fragments(audio={start time})">
      <QualityLevel Bitrate="192000" SamplingRate="44100" Channels="2" BitsPerSample="16" PacketSize="8917" AudioTag="354" CodecPrivateData="1000030000000000000000000000E0000000" />
      <c t="0" />
      <c t="22291156" />
      <c t="40867120" />
      <c t="60371882" />
      <c t="84056235" d="16718368" />
    </StreamIndex>
  </Clip>
</SmoothStreamingMedia>

Note that in this example implementation, the SmoothStreamingMedia element now contains a new Clip element. Each Clip element contains StreamIndex elements and their child elements as before. The Clip element has attributes called Url, ClipBegin, and ClipEnd. The Url attribute specifies the URL of the original source manifest from which the clip was cut. The value is similar to what a programmer or application would set on the SmoothStreamingSource property in a Smooth Streaming Media Element using MICROSOFT™ SILVERLIGHT™ (e.g., http://abcxyz.com/sample.ism/Manifest). The ClipBegin attribute specifies the time at which to begin playback of the clip, and the ClipEnd attribute specifies the time at which to end playback of the clip. In the examples herein, these times are expressed in 100-nanosecond units, so that a ClipEnd of 100000000 corresponds to the 10-second mark.

The ‘c’ elements are still present in this composite manifest so that the manifest is self-sufficient: a client can play the composite media presentation without downloading the source manifest (although other embodiments could rely on the client retrieving the original manifest to play the clips). Not all ‘c’ elements are included, because the composite manifest can omit chunks of the source presentation that fall outside the clip boundaries. In this case, the chunk times are close to ClipBegin and ClipEnd. They may not match exactly, because chunks may have one granularity (e.g., two seconds) while the clip is cut at a finer granularity (e.g., one second). The client can handle playing portions of chunks based on the clip times.
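
One way to decide which ‘c’ elements belong in a composite manifest is to keep every source chunk that overlaps the clip interval. The Python sketch below illustrates this under stated assumptions: the helper name chunks_for_clip is hypothetical, each chunk is assumed to end where the next one begins, and the final chunk is assumed to carry an explicit duration.

```python
def chunks_for_clip(chunk_starts, last_duration, clip_begin, clip_end):
    """Return (start, duration) for each chunk overlapping [clip_begin, clip_end).

    chunk_starts  -- ascending chunk start times from the source manifest
    last_duration -- duration of the final chunk in chunk_starts
    """
    # A chunk ends where the next one begins; the last uses its own duration.
    ends = chunk_starts[1:] + [chunk_starts[-1] + last_duration]
    return [
        (start, end - start)
        for start, end in zip(chunk_starts, ends)
        if start < clip_end and end > clip_begin
    ]


if __name__ == "__main__":
    # First seven video chunk times from the source manifest above.
    video = [0, 22350000, 42370000, 62390000, 82410000, 102430000, 122450000]
    for start, duration in chunks_for_clip(video, 20020000, 0, 100000000):
        print(start, duration)
```

Applied to the video chunk times of the source manifest with a ClipBegin of 0 and a ClipEnd of 100000000, this selection keeps the five chunks that appear in the composite manifest, the last of which has a duration of 20020000 and therefore extends slightly past the clip end.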

The previous example illustrated how to compose a single clip. The following example manifest extends this to two clips stitched together.

<?xml version="1.0" encoding="utf-16"?>
<SmoothStreamingMedia MajorVersion="2" MinorVersion="0" Duration="200000000">
  <Clip Url="http://abcxyz.com/sample.ism/Manifest" ClipBegin="0" ClipEnd="100000000">
    <StreamIndex Type="video" Chunks="5" QualityLevels="8"
        MaxWidth="640" MaxHeight="480" DisplayWidth="640" DisplayHeight="480"
        Url="QualityLevels({bitrate})/Fragments(video={start time})">
      <QualityLevel Index="0" Bitrate="1644000" FourCC="WVC1" MaxWidth="640" MaxHeight="480" CodecPrivateData="250000010FCBF213F0EF8A13F83BE80C9081B22B6457400000010E5A67F840" />
      <QualityLevel Index="1" Bitrate="1241000" FourCC="WVC1" MaxWidth="640" MaxHeight="480" CodecPrivateData="250000010FCBE613F0EF8A13F83BE80C9081A5DECBBE400000010E5A67F840" />
      <QualityLevel Index="2" Bitrate="937000" FourCC="WVC1" MaxWidth="640" MaxHeight="480" CodecPrivateData="250000010FCBDC13F0EF8A13F83BE80C90811C97F260C00000010E5A67F840" />
      <QualityLevel Index="3" Bitrate="708000" FourCC="WVC1" MaxWidth="428" MaxHeight="320" CodecPrivateData="250000010FCBD60D509F8A0D5827E80C9081159AD66CC00000010E5A67F840" />
      <QualityLevel Index="4" Bitrate="534000" FourCC="WVC1" MaxWidth="428" MaxHeight="320" CodecPrivateData="250000010FCBD00D509F8A0D5827E80C9081104B412F400000010E5A67F840" />
      <QualityLevel Index="5" Bitrate="403000" FourCC="WVC1" MaxWidth="428" MaxHeight="320" CodecPrivateData="250000010FCBCC0D509F8A0D5827E80C90808C4BE263400000010E5A67F840" />
      <QualityLevel Index="6" Bitrate="305000" FourCC="WVC1" MaxWidth="364" MaxHeight="272" CodecPrivateData="250000010FC3C80B50878A0B5821E80C9080894E4A76400000010E5A67F840" />
      <QualityLevel Index="7" Bitrate="230000" FourCC="WVC1" MaxWidth="364" MaxHeight="272" CodecPrivateData="250000010FC3C60B50878A0B5821E80C90800704704DC00000010E5A67F840" />
      <c t="0" />
      <c t="22350000" />
      <c t="42370000" />
      <c t="62390000" />
      <c t="82410000" d="20020000" />
    </StreamIndex>
    <StreamIndex Type="audio" Index="0" FourCC="WMAP" Chunks="5" QualityLevels="1"
        Url="QualityLevels({bitrate})/Fragments(audio={start time})">
      <QualityLevel Bitrate="192000" SamplingRate="44100" Channels="2" BitsPerSample="16" PacketSize="8917" AudioTag="354" CodecPrivateData="1000030000000000000000000000E0000000" />
      <c t="0" />
      <c t="22291156" />
      <c t="40867120" />
      <c t="60371882" />
      <c t="84056235" d="16718368" />
    </StreamIndex>
  </Clip>
  <Clip Url="http://abcxyz.com/sample2.ism/Manifest" ClipBegin="60000000" ClipEnd="160000000">
    <StreamIndex Type="video" Chunks="5" QualityLevels="8"
        MaxWidth="848" MaxHeight="476" DisplayWidth="848" DisplayHeight="476"
        Url="QualityLevels({bitrate})/Fragments(video={start time})">
      <QualityLevel Index="0" Bitrate="1644000" FourCC="WVC1" MaxWidth="848" MaxHeight="476" CodecPrivateData="250000010FCBB21A70ED8A1A783B68045081B22B6457400000010E5A67F840" />
      <QualityLevel Index="1" Bitrate="1241000" FourCC="WVC1" MaxWidth="848" MaxHeight="476" CodecPrivateData="250000010FCBA61A70ED8A1A783B68045081A5DECBBE400000010E5A67F840" />
      <QualityLevel Index="2" Bitrate="937000" FourCC="WVC1" MaxWidth="848" MaxHeight="476" CodecPrivateData="250000010FCB9C1A70ED8A1A783B680450811C97F260C00000010E5A67F840" />
      <QualityLevel Index="3" Bitrate="708000" FourCC="WVC1" MaxWidth="568" MaxHeight="320" CodecPrivateData="250000010FCB9611B09F8A11B827E8045081159AD66CC00000010E5A67F840" />
      <QualityLevel Index="4" Bitrate="534000" FourCC="WVC1" MaxWidth="568" MaxHeight="320" CodecPrivateData="250000010FCB9011B09F8A11B827E8045081104B412F400000010E5A67F840" />
      <QualityLevel Index="5" Bitrate="403000" FourCC="WVC1" MaxWidth="568" MaxHeight="320" CodecPrivateData="250000010FCB8C11B09F8A11B827E80450808C4BE263400000010E5A67F840" />
      <QualityLevel Index="6" Bitrate="305000" FourCC="WVC1" MaxWidth="480" MaxHeight="272" CodecPrivateData="250000010FCB880EF0878A0EF821E8045080894E4A76400000010E5A67F840" />
      <QualityLevel Index="7" Bitrate="230000" FourCC="WVC1" MaxWidth="480" MaxHeight="272" CodecPrivateData="250000010FCB860EF0878A0EF821E80450800704704DC00000010E5A67F840" />
      <c t="60000000" />
      <c t="80000000" />
      <c t="100000000" />
      <c t="120000000" />
      <c t="140000000" d="20000000" />
    </StreamIndex>
    <StreamIndex Type="audio" Index="0" FourCC="WMAP" Chunks="6" QualityLevels="1"
        Url="QualityLevels({bitrate})/Fragments(audio={start time})">
      <QualityLevel Bitrate="192000" SamplingRate="44100" Channels="2" BitsPerSample="16" PacketSize="8917" AudioTag="354" CodecPrivateData="1000030000000000000000000000E0000000" />
      <c t="42724716" />
      <c t="61082992" />
      <c t="80341043" />
      <c t="103096598" />
      <c t="120279365" />
      <c t="142570521" d="21362358" />
    </StreamIndex>
  </Clip>
</SmoothStreamingMedia>
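
To present several clips as one presentation, a client may map a position on the combined timeline back to a particular clip and a time within that clip's source. The following Python sketch shows one such mapping for the two clips above. The Clip class and the locate function are hypothetical names introduced for illustration, and the sketch assumes that clips play back to back in manifest order.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    url: str    # Url attribute: manifest of the source presentation
    begin: int  # ClipBegin, in manifest time units
    end: int    # ClipEnd, in manifest time units


def locate(clips, position):
    """Map a composite-timeline position to (clip index, source time)."""
    offset = 0  # composite time at which the current clip starts
    for index, clip in enumerate(clips):
        length = clip.end - clip.begin
        if position < offset + length:
            return index, clip.begin + (position - offset)
        offset += length
    raise ValueError("position is past the end of the composite presentation")


if __name__ == "__main__":
    clips = [
        Clip("http://abcxyz.com/sample.ism/Manifest", 0, 100000000),
        Clip("http://abcxyz.com/sample2.ism/Manifest", 60000000, 160000000),
    ]
    print(locate(clips, 50000000))   # falls inside the first clip
    print(locate(clips, 150000000))  # falls inside the second clip
```

With these two clips, the combined length is 200000000 time units, which agrees with the Duration attribute of the composite manifest.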

A content producer can upload this manifest along with the original manifests and content (which may be already uploaded and available) to an HTTP accessible location. A client application sets its source to the URL of the composite manifest, and the client component handles playback of the composite media presentation. The following is an Extensible Application Markup Language (XAML) example for playing a composite media presentation using MICROSOFT™ SILVERLIGHT™.

<UserControl x:Class="SilverlightApplication6.MainPage"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:SSME="clr-namespace:Microsoft.Web.Media.SmoothStreaming;assembly=Microsoft.Web.Media.SmoothStreaming"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    mc:Ignorable="d" d:DesignWidth="640" d:DesignHeight="480">
  <Grid x:Name="LayoutRoot">
    <SSME:SmoothStreamingMediaElement SmoothStreamingSource="http://abcxyz.com/SampleRCEManifest.csm" x:Name="SmoothPlayer" />
  </Grid>
</UserControl>

In some embodiments, the streaming composition system retrieves chunks earlier than the start of a particular clip to provide a playback system with sufficient starting information. For example, where a clip does not align well with a frame boundary or where chunks include intermediate frames, the streaming composition system may identify a prior chunk that includes a key frame or other information for providing a suitable start of playback to the playback system. After providing the playback system with starting information, the streaming composition system instructs the playback system to seek to the appropriate location identified by the clip. This allows the playback system to smoothly start playback at the specified clip boundary. The system may also expand the virtual timeline to include the first clip's last I-frame.
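
The lookup of a prior chunk followed by a seek can be sketched as below. This is an illustration only: the function name start_chunk_and_seek is hypothetical, and the sketch assumes that every chunk begins with a key frame, so that the latest chunk starting at or before the clip start provides suitable starting information.

```python
from bisect import bisect_right


def start_chunk_and_seek(chunk_starts, clip_begin):
    """Return (start time of the first chunk to fetch, seek offset into it).

    chunk_starts must be sorted in ascending order.
    """
    # Index of the last chunk whose start time is <= clip_begin.
    index = bisect_right(chunk_starts, clip_begin) - 1
    if index < 0:
        raise ValueError("clip begins before the first available chunk")
    return chunk_starts[index], clip_begin - chunk_starts[index]


if __name__ == "__main__":
    # Audio chunk times of the second clip in the two-clip manifest above.
    audio = [42724716, 61082992, 80341043, 103096598, 120279365, 142570521]
    print(start_chunk_and_seek(audio, 60000000))
```

For the audio stream of the second clip in the two-clip manifest (ClipBegin of 60000000), the first chunk to fetch starts at 42724716, and the playback system would then seek 17275284 time units into it. This is consistent with that manifest listing an audio chunk that begins before the clip start.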

Using the above technique and others described herein, the streaming composition system can transition between clips that do not specify locations within a media stream that are at an I-frame boundary (i.e., the system does not require switching at an I-frame boundary). In addition, the system can perform near frame-level granularity transitions, where a frame from one clip is immediately followed by a subsequent frame from another clip.

In some embodiments, the streaming composition system uses the same audio encoding settings and video codecs across clips. By doing so, the system can avoid resetting any underlying hardware or playback system information upon transitioning from clip to clip. In addition, this can allow the system to drop new clip data into existing playback buffers and rely on the underlying playback system to seamlessly play back multiple clips.

In some embodiments, the streaming composition system handles redirects during playback. It is possible that either the composite manifest URL or any source server URL will refer to content that has been moved. Moves on web servers are often indicated by an HTTP 301 or 302 redirect response that identifies a new server. The streaming composition system may allow extra time during retrieval to handle any redirects and retrieve content from a new location.

In some embodiments, the streaming composition system leverages caching of the original clip sources. Because each chunk is an individually cacheable HTTP content item, caches may contain chunks for the original source media presentations. Because the composite manifest refers to clip chunks using the same URL as the original presentation, the system benefits from any existing caching of the source presentations. Thus, the system makes highlights quickly available to clients.
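
The following Python sketch illustrates why a clip's chunk requests can hit the same cache entries as the original presentation: the request URL is derived only from the source manifest URL, the Url template of the StreamIndex, a bitrate, and a chunk start time. The function name chunk_url is hypothetical, and the sketch assumes that chunk URLs resolve relative to the location of the source manifest, which may differ in a given deployment.

```python
def chunk_url(manifest_url, url_template, bitrate, start_time):
    """Build the request URL for one chunk of a source presentation."""
    # Assumption: chunks are addressed relative to the manifest's parent path,
    # so the trailing "Manifest" segment is dropped.
    base = manifest_url.rsplit("/", 1)[0]
    path = (
        url_template
        .replace("{bitrate}", str(bitrate))
        .replace("{start time}", str(start_time))
    )
    return base + "/" + path


if __name__ == "__main__":
    print(chunk_url(
        "http://abcxyz.com/sample.ism/Manifest",
        "QualityLevels({bitrate})/Fragments(video={start time})",
        1644000,
        22350000,
    ))
```

Because nothing in the derived URL depends on the composite manifest itself, a client playing the clip and a client playing the original presentation would request the same resource for the same chunk.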

From the foregoing, it will be appreciated that specific embodiments of the streaming composition system have been described herein for purposes of illustration, but that various modifications may be made without deviating from the spirit and scope of the invention. Accordingly, the invention is not limited except as by the appended claims.

I/We claim:
 1. A system, comprising: at least one processor and a memory; the at least one processor configured to: provide an interface through which a content producer can identify media assets from which to compose clips from one or more sources into a seamless composite media presentation; receive information identifying one or more clips for the composite media presentation; identify the one or more clips identified by the received information and store information to play each identified clip; and create a composite manifest that specifies information about the composite media presentation and each clip associated with the composite media presentation, each clip defines a start time and an end time of media content within a chunk that is to be played, each chunk being an individually addressable portion of media content that is cacheable by a network infrastructure, the composite manifest describes a plurality of chunks of media content forming the composite media presentation and an order for playback of the chunks based on the sequence of clips.
 2. The system of claim 1 wherein receive information identifying one or more clips further comprises receive selections of one or more previous presentations and identify start and stop times of one or more locations within the previous presentations from which to create clips.
 3. The system of claim 1 wherein create the composite manifest further comprises order clips so that a computer can play clip after clip as a single presentation to the user without asking the user to identify each subsequent item to play.
 4. The system of claim 1, wherein at least one clip is part of an on-demand presentation.
 5. The system of claim 1, wherein at least one clip is part of a live presentation.